Performance factor analysis for the 2012 NIST speaker recognition evaluation

Authors

  • Alvin F. Martin
  • Craig S. Greenberg
  • Vincent M. Stanford
  • John M. Howard
  • George R. Doddington
  • John J. Godfrey
Abstract

The 2012 NIST Speaker Recognition Evaluation, held in the autumn of 2012, was designed to examine a variety of factors affecting the performance of automatic systems for speaker recognition. Here we examine, for leading systems included in this evaluation, the observed effects on performance of five such factors: the inclusion in test segment speech of environmental noise or of added synthetic noise of one of three types and one of two intensity levels, the duration of test segment speech, the number and the channel type of target speaker training sessions, the type of the microphone channel used in test segment speech, and the sex of the target speaker. This evaluation is notable for being the first in the series to include examination of the effects on performance of synthetic added noise. Crowd noise is observed to have a greater impact than HVAC noise, and single-speaker noise a greater impact than crowd noise. Future evaluation plans are also discussed.


Similar resources

The I3a speaker recognition system for NIST SRE12: post-evaluation analysis

The I3A submission for the recent NIST 2012 speaker recognition evaluation (SRE) was based on the i-vector approach with a multi-channel PLDA classifier. This PLDA is modified so that, for each i-vector, the between-class covariance depends on the type of channel where the segment was recorded (telephone, interview, clean, noisy, etc.). In this paper, we present the description of our submission ...


Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language Recognition

Gaussian mixture model support vector machine (GMM-SVM) with nuisance attribute projection (NAP) has been found to be effective and reliable for speaker and language recognition. In maximum a posteriori (MAP) adaptation of GMM, the relevance factor is the parameter that regulates how much the adaptation data affect the base model, which impacts the final recognition performance. In our previous ...


SHEEP, GOATS, LAMBS and WOLVES: a statistical analysis of speaker performance in the NIST 1998 speaker recognition evaluation

Performance variability in speech and speaker recognition systems can be attributed to many factors. One major factor, which is often acknowledged but seldom analyzed, is inherent differences in the recognizability of different speakers. In speaker recognition systems such differences are characterized by the use of animal names for different types of speakers, including sheep, goats, lambs and...


UTD-CRSS Systems for 2012 NIST Speaker Recognition Evaluation The CRSS SRE Team

This document briefly describes the systems submitted by the Center for Robust Speech Systems (CRSS) from The University of Texas at Dallas (UTD) for the 2012 NIST Speaker Recognition Evaluation. We developed a state-of-the-art i-vector-based speaker recognition system [1]. Probabilistic linear discriminant analysis (PLDA) [2] along with several other backends are used for channel/noise compens...


The MIT Lincoln Laboratory 2008 speaker recognition system

In recent years, methods for modeling and mitigating variational nuisances have been introduced and refined. A primary emphasis in this year's NIST 2008 Speaker Recognition Evaluation (SRE) was to greatly expand the use of auxiliary microphones. This offered additional channel variations, which have been a historical challenge to speaker verification systems. In this paper we present the MIT Li...




Publication date: 2014